
Fix stable CI test harness failures #1986

Draft
pranaygp wants to merge 19 commits into stable from pranaygp/codex/fix-stable-ci


@pranaygp pranaygp commented May 14, 2026

Summary

Fixes CI stability issues observed on the current stable branch:

  • Avoids deleting newly-added workflow files while the dev server can still rebuild against imports that reference them. The dev-test cleanup now keeps placeholders in place for dev-server rebuilds and avoids the missing-file race.
  • Speeds up local workflow port detection by probing with a real libuv-backed server listen attempt instead of waiting on slow HTTP probes.
  • Makes generated workflow health endpoints respond to HEAD, so health checks no longer need to probe with POST.
  • Wraps generated framework route exports so SvelteKit/Astro route functions keep the workflow route wrapper instead of leaking workflow queue triggers onto framework-owned functions.
  • Materializes manual webhook response bodies before returning from the webhook handler. In the Nitro Vercel prod failure, the workflow run completed in ~5s and all hooks were received, but the test still timed out at 120s; this keeps the HTTP response from depending on a secondary cross-context body stream after the handler returns.
  • Waits for step return-value stream serialization before recording step_completed. The Nuxt failure had readableStreamWorkflow complete before stream chunks were persisted, so the client observed an empty stream.
  • Bounds Vercel stream write/close mutations with an abort signal. The SvelteKit DurableAgent failure showed writeToolOutputToUI started but never recorded step_completed, leaving the workflow running until Vitest timed out; stuck stream writes now fail/retry instead of pinning the step forever.
  • Applies the same overall timeout to queue-based health checks, including queue delivery and stream-open operations. The Astro prod failure timed out inside healthCheck() even though the following CLI health check passed, which showed the health-check API could hang outside its advertised timeout.
  • Gives remote Vercel e2e CLI inspections a Vercel-specific timeout budget. In the Vite failure, both addTenWorkflow runs completed in ~5s, but workflow inspect --withData was then killed by the harness's old 20s subprocess timeout while fetching/decrypting remote run data.
  • Makes sleep e2e assertions use persisted event timing instead of Date.now() values returned from replayed workflow code. The Express failure showed that the interval from run_started to wait_completed exceeded 10s, while the replayed return value measured only the latter portion of the event stream.
  • Makes hook e2e tests wait for the hook token to actually be registered/released/re-owned instead of relying on fixed sleeps. Recent Hono/Nitro failures tried to read or reuse hooks before the runtime had persisted the expected hook state.
  • Aligns the remote addTenWorkflow test timeout with observed Vercel prod queue/cold-start latency, since some runs completed only after the old 60s test timeout.
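
The sleep-assertion item above can be illustrated with a small sketch. It assumes a hypothetical event shape (`eventType`, `createdAt`) and a hypothetical helper `elapsedBetween`; neither is taken from the actual `@workflow/core` API, and only the event names `run_started` and `wait_completed` come from this PR. The idea is to assert on the gap between persisted events rather than on a `Date.now()` delta computed inside replayed workflow code:

```typescript
// Hypothetical event shape; the real persisted event schema may differ.
interface WorkflowEvent {
  eventType: string;
  createdAt: Date;
}

// Milliseconds between the first `from` event and the first `to` event
// that was recorded at or after it.
function elapsedBetween(
  events: WorkflowEvent[],
  from: string,
  to: string,
): number {
  const start = events.find((e) => e.eventType === from);
  if (!start) throw new Error(`missing ${from} event`);
  const end = events.find(
    (e) =>
      e.eventType === to &&
      e.createdAt.getTime() >= start.createdAt.getTime(),
  );
  if (!end) throw new Error(`missing ${to} event after ${from}`);
  return end.createdAt.getTime() - start.createdAt.getTime();
}

// Offsets mirror the Express failure in the CI notes:
// run_started at +0.5s and wait_completed at +11.3s.
const sampleEvents: WorkflowEvent[] = [
  { eventType: "run_started", createdAt: new Date(500) },
  { eventType: "wait_completed", createdAt: new Date(11_300) },
];

const sleptMs = elapsedBetween(sampleEvents, "run_started", "wait_completed");
console.log(sleptMs); // 10800
```

With those offsets the persisted timeline yields 10800ms, which satisfies a "slept at least 10s" assertion even though the replayed `Date.now()` delta in that run was only 5509ms.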

Includes a changeset for the touched packages.
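
For the hook e2e item in the summary, replacing a fixed sleep with a wait on observable state might look roughly like the sketch below. `waitFor` and its options are hypothetical names chosen for illustration, not the helper used in `e2e.test.ts`:

```typescript
// Hypothetical polling helper: retries `check` until it yields a value
// or the deadline passes. Errors thrown by `check` (for example a
// not-found error while the hook is still being persisted) are retried.
async function waitFor<T>(
  check: () => Promise<T | undefined>,
  { timeoutMs = 30_000, intervalMs = 250 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  let lastError: unknown;
  while (true) {
    try {
      const value = await check();
      if (value !== undefined) return value;
    } catch (err) {
      lastError = err;
    }
    if (Date.now() >= deadline) {
      const detail =
        lastError instanceof Error ? `: ${lastError.message}` : "";
      throw new Error(`condition not met within ${timeoutMs}ms${detail}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A test would then call something like `await waitFor(() => lookupHook(token))` before resuming the hook, where `lookupHook` stands in for whatever call reads the persisted hook state. The test proceeds as soon as the state is visible and fails with a clear message if it never appears.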

CI notes

  • Run 25894395244 failed only E2E Vercel Prod Tests (nitro) plus the aggregate required check. The failed webhookWorkflow run wrun_01KRMJH5E2V02Y9KC4NJ0EHB0P had completed; its event timeline reached run_completed at +5.3s, so the remaining failure was in the webhook HTTP response path, not workflow execution.
  • Run 25895081248 fixed Nitro and all local e2e jobs passed. The only app-specific failure was E2E Vercel Prod Tests (vite), where both failed addTenWorkflow runs were already completed (wrun_01KRMKS6SH17MSH902CATGVGYH and wrun_01KRMKTGK9AYX2S228DAWD4GHV reached run_completed in ~5s). The failure was the e2e harness killing workflow inspect --withData with SIGTERM after 20s.
  • Run 25898693068 fixed Vite. Remaining failures were E2E Vercel Prod Tests (express) and E2E Vercel Prod Tests (nitro): Express failed sleepingWorkflow because replayed Date.now() returned a 5509ms delta while the event timeline showed run_started at +0.5s and wait_completed at +11.3s; Nitro failed hookDisposeTestWorkflow because workflow 2 started before workflow 1 had disposed the shared token, producing a correct hook_conflict.
  • Run 25902464337 fixed Vite, Express, and Nitro. Remaining failure was E2E Vercel Prod Tests (nuxt): readableStreamWorkflow returned an empty stream even though the step/run completed; the step event timeline completed in ~1s while the stream-producing step should take ~10s, showing step_completed was racing ahead of return stream serialization.
  • Run 25905802133 fixed Nuxt. Remaining failures were E2E Local Dev Tests (hono - stable) and E2E Local Prod Tests (nitro - stable), both failing hookWorkflow with HookNotFoundError after the fixed 5s hook-registration sleep, plus E2E Vercel Prod Tests (sveltekit), where DurableAgent single tool call timed out while run wrun_01KRN8AVPVXZEGPVV2Z61KW1MY was still running and its event timeline stopped at writeToolOutputToUI step_started.
  • Run 25913362448 fixed the Hono/Nitro hook races and SvelteKit stream hang. Remaining failure was E2E Vercel Prod Tests (astro): the direct queue-based health check timed out after 60s, while the following CLI health check passed in ~3s. That exposed that healthCheck(..., { timeout }) did not bound world.queue() or readFromStream() before response polling.
  • Run 25913963665 on 5daac2c87 passed, including the previously failing Hono/Nitro hook lanes, SvelteKit DurableAgent lane, Astro prod health-check lane, and the aggregate required check: https://github.com/vercel/workflow/actions/runs/25913963665
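
The Astro and SvelteKit notes above both come down to an operation that could outlive its advertised timeout. A minimal sketch of bounding an operation with an abort signal follows; `withTimeout` is a hypothetical helper, not the code in `helpers.ts` or `streamer.ts`:

```typescript
// Hypothetical helper: runs `op` with an AbortSignal that fires after `ms`.
async function withTimeout<T>(
  op: (signal: AbortSignal) => Promise<T>,
  ms: number,
  label = "operation",
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`${label} timed out after ${ms}ms`);
      controller.abort(err); // lets a cooperative `op` cancel its work
      reject(err);
    }, ms);
  });
  try {
    // Racing means an `op` that ignores the signal still cannot
    // hang the caller past the deadline.
    return await Promise.race([op(controller.signal), timedOut]);
  } finally {
    clearTimeout(timer);
  }
}
```

Wrapping every stage of a health check (queue delivery, stream open, response polling) in a single `withTimeout` call makes them share one overall budget instead of each stage waiting independently.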

Validation

  • fnm exec --using v22.18.0 pnpm exec biome check packages/core/src/runtime/helpers.ts packages/core/src/runtime/helpers.test.ts packages/core/e2e/e2e.test.ts packages/world-vercel/src/streamer.ts packages/world-vercel/src/streamer.test.ts .changeset/stable-ci-e2e-cleanup.md (only pre-existing warnings in e2e.test.ts)
  • fnm exec --using v22.18.0 pnpm vitest run packages/core/src/runtime/helpers.test.ts packages/world-vercel/src/streamer.test.ts packages/core/src/runtime/step-handler.test.ts packages/core/src/writable-stream.test.ts packages/core/src/step/writable-stream.test.ts
  • fnm exec --using v22.18.0 pnpm --filter @workflow/core --filter @workflow/world-vercel build
  • git diff --check
  • Earlier validation on this PR also covered @workflow/utils tests, builder package builds, SvelteKit/Astro production builds, Fastify dev rebuild smoke coverage, and Nitro Vercel preset builds.

@pranaygp pranaygp requested a review from a team as a code owner May 14, 2026 18:21
Copilot AI review requested due to automatic review settings May 14, 2026 18:21

vercel Bot commented May 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| example-nextjs-workflow-turbopack | Ready | Preview, Comment | May 15, 2026 10:55am |
| example-nextjs-workflow-webpack | Ready | Preview, Comment | May 15, 2026 10:55am |
| example-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-astro-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-express-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-fastify-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-hono-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-nitro-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-nuxt-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-sveltekit-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-tanstack-start-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workbench-vite-workflow | Ready | Preview, Comment | May 15, 2026 10:55am |
| workflow-docs | Ready | Preview, Comment, Open in v0 | May 15, 2026 10:55am |
| workflow-swc-playground | Ready | Preview, Comment | May 15, 2026 10:55am |
| workflow-tarballs | Ready | Preview, Comment | May 15, 2026 10:55am |
| workflow-web | Ready | Preview, Comment | May 15, 2026 10:55am |


changeset-bot Bot commented May 14, 2026

🦋 Changeset detected

Latest commit: 5daac2c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 23 packages
| Name | Type |
| --- | --- |
| @workflow/astro | Patch |
| @workflow/builders | Patch |
| @workflow/core | Patch |
| @workflow/nest | Patch |
| @workflow/next | Patch |
| @workflow/sveltekit | Patch |
| @workflow/utils | Patch |
| @workflow/world-vercel | Patch |
| workflow | Patch |
| tarballs | Patch |
| @workflow/cli | Patch |
| @workflow/nitro | Patch |
| @workflow/rollup | Patch |
| @workflow/vite | Patch |
| @workflow/vitest | Patch |
| @workflow/web-shared | Patch |
| @workflow/web | Patch |
| @workflow/world-testing | Patch |
| @workflow/errors | Patch |
| @workflow/world-local | Patch |
| @workflow/world-postgres | Patch |
| @workflow/ai | Patch |
| @workflow/nuxt | Patch |



github-actions Bot commented May 14, 2026

🧪 E2E Test Results

Some tests failed

Summary

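| Category | Passed | Failed | Skipped | Total |
| --- | --- | --- | --- | --- |
| ✅ ▲ Vercel Production | 901 | 0 | 67 | 968 |
| ✅ 💻 Local Development | 970 | 0 | 86 | 1056 |
| ✅ 📦 Local Production | 970 | 0 | 86 | 1056 |
| ✅ 🐘 Local Postgres | 970 | 0 | 86 | 1056 |
| ✅ 🪟 Windows | 88 | 0 | 0 | 88 |
| ❌ 🌍 Community Worlds | 136 | 86 | 0 | 222 |
| ✅ 📋 Other | 492 | 0 | 36 | 528 |
| Total | 4527 | 86 | 361 | 4974 |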

❌ Failed Tests

🌍 Community Worlds (86 failed)

mongodb (12 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KRNMAVFM2S98X2BA2PREKMJH
  • webhookWorkflow | wrun_01KRNMB0Y2NFT34YF3YR98EEWZ
  • sleepingWorkflow | wrun_01KRNMB73XRE1VJFR0SVT2BC1R
  • outputStreamWorkflow no startIndex (reads all chunks)
  • outputStreamWorkflow negative startIndex (reads from end)
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns correct index after stream completes
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns -1 before any chunks are written
  • outputStreamWorkflow - getTailIndex and getStreamChunks getStreamChunks returns same content as reading the stream
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions | wrun_01KRNMECSMPYSHWZ1Z3CCV6WWK
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KRNMJTXJT0X3TGPQ8T6XBE0B
  • pages router sleepingWorkflow via pages router
  • resilient start: addTenWorkflow completes when run_created returns 500 | wrun_01KRNMS5SRRTWDQW3357XX328T

redis (9 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KRNMAVFM2S98X2BA2PREKMJH
  • sleepingWorkflow | wrun_01KRNMB73XRE1VJFR0SVT2BC1R
  • outputStreamWorkflow negative startIndex (reads from end)
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns correct index after stream completes
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns -1 before any chunks are written
  • outputStreamWorkflow - getTailIndex and getStreamChunks getStreamChunks returns same content as reading the stream
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KRNMJTXJT0X3TGPQ8T6XBE0B
  • pages router sleepingWorkflow via pages router
  • resilient start: addTenWorkflow completes when run_created returns 500 | wrun_01KRNMS5SRRTWDQW3357XX328T

turso (65 failed):

  • addTenWorkflow | wrun_01KRNM9GXT8PENH5SGRTMMZKBN
  • addTenWorkflow | wrun_01KRNM9GXT8PENH5SGRTMMZKBN
  • wellKnownAgentWorkflow (.well-known/agent) | wrun_01KRNMB8AJNA1PNBEX7HF1QMJ1
  • should work with react rendering in step
  • promiseAllWorkflow | wrun_01KRNM9S6ZQPEM075AHVM4GT7G
  • promiseRaceWorkflow | wrun_01KRNM9Z9DKQFPAK84FKCWKC8X
  • promiseAnyWorkflow | wrun_01KRNMA1EVMAKHJSMM7NZG1YV6
  • importedStepOnlyWorkflow | wrun_01KRNMBK9WFND071F3KARBG0NP
  • readableStreamWorkflow | wrun_01KRNMA63JMRFSSRQNXER9RB6Z
  • hookWorkflow | wrun_01KRNMAKCNR7BP6E7858RHQA1S
  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KRNMAVFM2S98X2BA2PREKMJH
  • webhookWorkflow | wrun_01KRNMB0Y2NFT34YF3YR98EEWZ
  • sleepingWorkflow | wrun_01KRNMB73XRE1VJFR0SVT2BC1R
  • parallelSleepWorkflow | wrun_01KRNMBPC9XMBS30EYGFH7SG0F
  • nullByteWorkflow | wrun_01KRNMBSRFFGMWN9CBHFC1GPDM
  • workflowAndStepMetadataWorkflow | wrun_01KRNMBVY1K96QNWB20W0XW3BG
  • outputStreamWorkflow no startIndex (reads all chunks)
  • outputStreamWorkflow positive startIndex (skips first chunk)
  • outputStreamWorkflow negative startIndex (reads from end)
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns correct index after stream completes
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns -1 before any chunks are written
  • outputStreamWorkflow - getTailIndex and getStreamChunks getStreamChunks returns same content as reading the stream
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions | wrun_01KRNMECSMPYSHWZ1Z3CCV6WWK
  • fetchWorkflow | wrun_01KRNMETPX2PAY12CA0M931XKQ
  • promiseRaceStressTestWorkflow | wrun_01KRNMEY66CVZAAVCRNS1N8R32
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling retry behavior RetryableError respects custom retryAfter delay
  • error handling retry behavior maxRetries=0 disables retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • error handling not registered WorkflowNotRegisteredError fails the run when workflow does not exist
  • error handling not registered StepNotRegisteredError fails the step but workflow can catch it
  • error handling not registered StepNotRegisteredError fails the run when not caught in workflow
  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KRNMJDRQGDSEFMFX88TEPAW4
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KRNMJTXJT0X3TGPQ8T6XBE0B
  • hookDisposeTestWorkflow - hook token reuse after explicit disposal while workflow still running | wrun_01KRNMKB64ZYKT4E1NYX52E72V
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars) | wrun_01KRNMKVDYRKWM5JD2J52RVZ8G
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument | wrun_01KRNMM6A9WM9Z693NTQWT08V3
  • closureVariableWorkflow - nested step functions with closure variables | wrun_01KRNMMBZFN7DAT9JBEGV5PP2T
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step | wrun_01KRNMME68VR8PZRQ8QCDEB7PY
  • health check (queue-based) - workflow and step endpoints respond to health check messages
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly | wrun_01KRNMMYP9KRX83BMHJ9SPM5NG
  • Calculator.calculate - static workflow method using static step methods from another class | wrun_01KRNMN4GCXAFX28BPVJK77ZHX
  • AllInOneService.processNumber - static workflow method using sibling static step methods | wrun_01KRNMNB7YZ21CNZ7HW9Z2YMH3
  • ChainableService.processWithThis - static step methods using this to reference the class | wrun_01KRNMNJ4GRF9H58KC7460XH3N
  • thisSerializationWorkflow - step function invoked with .call() and .apply() | wrun_01KRNMNSD930522T6EFJ1T5YCV
  • customSerializationWorkflow - custom class serialization with WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE | wrun_01KRNMP0AFVMVCPNFV5B9B6F76
  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KRNMP72HB8FQ8MWN9MS5RQJQ
  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context | wrun_01KRNMPJH0F136EDM28HRCKM0A
  • stepFunctionAsStartArgWorkflow - step function reference passed as start() argument | wrun_01KRNMPVM74SKYE7PC6NWEX8TT
  • cancelRun - cancelling a running workflow | wrun_01KRNMQ95A0AFPGJNN91PKVZ0B
  • cancelRun via CLI - cancelling a running workflow | wrun_01KRNMQJN6W77V25H54BYHCGC0
  • pages router addTenWorkflow via pages router
  • pages router promiseAllWorkflow via pages router
  • pages router sleepingWorkflow via pages router
  • hookWithSleepWorkflow - hook payloads delivered correctly with concurrent sleep | wrun_01KRNMQZSQHCQETX2PEC57B5PX
  • sleepInLoopWorkflow - sleep inside loop with steps actually delays each iteration | wrun_01KRNMRFVQZBAHHREHX13MET0C
  • sleepWithSequentialStepsWorkflow - sequential steps work with concurrent sleep (control) | wrun_01KRNMRTJ3MMD2K9AE7K61TDAD
  • importMetaUrlWorkflow - import.meta.url is available in step bundles | wrun_01KRNMS1DW8YPPVAV813T99TXB
  • metadataFromHelperWorkflow - getWorkflowMetadata/getStepMetadata work from module-level helper (#1577) | wrun_01KRNMS3KMXNB2A5TKEMX4PD6W
  • resilient start: addTenWorkflow completes when run_created returns 500 | wrun_01KRNMS5SRRTWDQW3357XX328T

Details by Category

✅ ▲ Vercel Production

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ astro | 81 | 0 | 7 |
| ✅ example | 81 | 0 | 7 |
| ✅ express | 81 | 0 | 7 |
| ✅ fastify | 81 | 0 | 7 |
| ✅ hono | 81 | 0 | 7 |
| ✅ nextjs-turbopack | 86 | 0 | 2 |
| ✅ nextjs-webpack | 86 | 0 | 2 |
| ✅ nitro | 81 | 0 | 7 |
| ✅ nuxt | 81 | 0 | 7 |
| ✅ sveltekit | 81 | 0 | 7 |
| ✅ vite | 81 | 0 | 7 |

✅ 💻 Local Development

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ astro-stable | 82 | 0 | 6 |
| ✅ express-stable | 82 | 0 | 6 |
| ✅ fastify-stable | 82 | 0 | 6 |
| ✅ hono-stable | 82 | 0 | 6 |
| ✅ nextjs-turbopack-canary | 69 | 0 | 19 |
| ✅ nextjs-turbopack-stable | 88 | 0 | 0 |
| ✅ nextjs-webpack-canary | 69 | 0 | 19 |
| ✅ nextjs-webpack-stable | 88 | 0 | 0 |
| ✅ nitro-stable | 82 | 0 | 6 |
| ✅ nuxt-stable | 82 | 0 | 6 |
| ✅ sveltekit-stable | 82 | 0 | 6 |
| ✅ vite-stable | 82 | 0 | 6 |

✅ 📦 Local Production

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ astro-stable | 82 | 0 | 6 |
| ✅ express-stable | 82 | 0 | 6 |
| ✅ fastify-stable | 82 | 0 | 6 |
| ✅ hono-stable | 82 | 0 | 6 |
| ✅ nextjs-turbopack-canary | 69 | 0 | 19 |
| ✅ nextjs-turbopack-stable | 88 | 0 | 0 |
| ✅ nextjs-webpack-canary | 69 | 0 | 19 |
| ✅ nextjs-webpack-stable | 88 | 0 | 0 |
| ✅ nitro-stable | 82 | 0 | 6 |
| ✅ nuxt-stable | 82 | 0 | 6 |
| ✅ sveltekit-stable | 82 | 0 | 6 |
| ✅ vite-stable | 82 | 0 | 6 |

✅ 🐘 Local Postgres

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ astro-stable | 82 | 0 | 6 |
| ✅ express-stable | 82 | 0 | 6 |
| ✅ fastify-stable | 82 | 0 | 6 |
| ✅ hono-stable | 82 | 0 | 6 |
| ✅ nextjs-turbopack-canary | 69 | 0 | 19 |
| ✅ nextjs-turbopack-stable | 88 | 0 | 0 |
| ✅ nextjs-webpack-canary | 69 | 0 | 19 |
| ✅ nextjs-webpack-stable | 88 | 0 | 0 |
| ✅ nitro-stable | 82 | 0 | 6 |
| ✅ nuxt-stable | 82 | 0 | 6 |
| ✅ sveltekit-stable | 82 | 0 | 6 |
| ✅ vite-stable | 82 | 0 | 6 |

✅ 🪟 Windows

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ nextjs-turbopack | 88 | 0 | 0 |

❌ 🌍 Community Worlds

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ mongodb-dev | 5 | 0 | 0 |
| ❌ mongodb | 57 | 12 | 0 |
| ✅ redis-dev | 5 | 0 | 0 |
| ❌ redis | 60 | 9 | 0 |
| ✅ turso-dev | 5 | 0 | 0 |
| ❌ turso | 4 | 65 | 0 |

✅ 📋 Other

| App | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ e2e-local-dev-nest-stable | 82 | 0 | 6 |
| ✅ e2e-local-dev-tanstack-start-stable | 82 | 0 | 6 |
| ✅ e2e-local-postgres-nest-stable | 82 | 0 | 6 |
| ✅ e2e-local-postgres-tanstack-start-stable | 82 | 0 | 6 |
| ✅ e2e-local-prod-nest-stable | 82 | 0 | 6 |
| ✅ e2e-local-prod-tanstack-start-stable | 82 | 0 | 6 |


Comment thread packages/utils/src/get-port.test.ts Outdated
  expect(port).toBe(fastAddr.port);
  // Should complete reasonably quickly (Windows CI can be slow)
- expect(elapsed).toBeLessThan(2000);
+ expect(elapsed).toBeLessThan(5000);

5s is too long - what's actually causing the get port to take so long and can we make this faster?


@pranaygp pranaygp May 14, 2026


Yep, that 5s bump was masking the wrong thing. The slow part was Windows netstat / process-port discovery, not the HTTP probe timeout itself.

I changed the test to pass explicit candidatePorts, so it bypasses OS discovery and measures the custom probe timeout directly. The assertion is back to less than 2000ms.
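
For context, probing an explicit list of candidate ports with a real listen attempt, as the PR summary describes for local port detection, could be sketched as below. The function names are hypothetical and this is not the `@workflow/utils` implementation; only the `candidatePorts` name comes from this thread. The sketch assumes that a listen failing with `EADDRINUSE` means a server already owns the port:

```typescript
import { createServer } from "node:net";

// Resolves true when something is already bound to `port` on `host`.
// Assumption: EADDRINUSE from a listen attempt means "a server is there".
function isPortInUse(port: number, host = "127.0.0.1"): Promise<boolean> {
  return new Promise((resolve, reject) => {
    const probe = createServer();
    probe.once("error", (err: NodeJS.ErrnoException) => {
      if (err.code === "EADDRINUSE") resolve(true);
      else reject(err); // e.g. EACCES is not evidence of a listener
    });
    probe.once("listening", () => {
      // Nothing was bound; release the port again right away.
      probe.close(() => resolve(false));
    });
    probe.listen(port, host);
  });
}

// Returns the first candidate port that already has a listener, if any.
async function firstListeningPort(
  candidatePorts: number[],
): Promise<number | undefined> {
  for (const port of candidatePorts) {
    if (await isPortInUse(port)) return port;
  }
  return undefined;
}
```

Because the listen attempt succeeds or fails locally, there is no HTTP round trip or probe timeout to wait on, and passing the candidates explicitly skips OS-level process and port discovery.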
